A robust/fast spoken term detection method based on a syllable n-gram index with a distance metric

Authors

  • Seiichi Nakagawa
  • Keisuke Iwami
  • Yasuhisa Fujii
  • Kazumasa Yamamoto
Abstract

For spoken document retrieval, it is crucial to handle out-of-vocabulary (OOV) words and the misrecognition of spoken words. Consequently, sub-word unit based recognition and retrieval methods have been proposed. This paper describes a Japanese spoken term detection method for spoken documents that is robust to OOV words and misrecognition. To solve the problem of OOV keywords, we use individual syllables as the sub-word unit in continuous speech recognition. To address OOV words and recognition errors while keeping retrieval fast, we propose a distant n-gram indexing/retrieval method that incorporates a distance metric in a syllable lattice. When applied to syllable sequences, our proposed method outperformed a conventional DTW method between syllable sequences and was about 100 times faster. The retrieval results show that we can detect OOV words in a database containing 44 h of audio in less than 10 ms per query with an F-measure of 0.54. © 2012 Elsevier B.V. All rights reserved.
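
The general idea of a syllable n-gram index can be illustrated with a minimal sketch. This is not the authors' distant n-gram method: it indexes only 1-best syllable sequences (no lattice, no distance metric) and tolerates recognition errors by voting, so a candidate is kept when enough of the query's n-grams agree on the same start position. The names `build_index`, `search`, and `min_ratio`, and the romanized syllables in the usage note, are assumptions made for illustration.

```python
from collections import defaultdict


def build_index(docs, n=3):
    """Map each syllable n-gram to the (doc_id, position) pairs where it starts.

    `docs` maps a document id to its recognized syllable sequence (hypothetical input format).
    """
    index = defaultdict(list)
    for doc_id, syllables in docs.items():
        for pos in range(len(syllables) - n + 1):
            index[tuple(syllables[pos:pos + n])].append((doc_id, pos))
    return index


def search(index, query, n=3, min_ratio=0.5):
    """Return (doc_id, start, score) candidates, best score first.

    score = fraction of the query's n-grams found at a consistent offset,
    so a few misrecognized syllables lower the score instead of losing the hit.
    """
    grams = [tuple(query[i:i + n]) for i in range(len(query) - n + 1)]
    if not grams:
        return []
    votes = defaultdict(int)
    for offset, gram in enumerate(grams):
        # index.get avoids creating empty entries in the defaultdict
        for doc_id, pos in index.get(gram, ()):
            votes[(doc_id, pos - offset)] += 1  # implied start of the term
    results = [
        (doc_id, start, count / len(grams))
        for (doc_id, start), count in votes.items()
        if count / len(grams) >= min_ratio
    ]
    results.sort(key=lambda r: (-r[2], r[0], r[1]))
    return results
```

For example, with `docs = {"d1": ["to", "o", "kyo", "o", "to", "shi"]}` and `n=2`, the query `["kyo", "o", "to", "shi"]` matches all three of its bigrams at start position 2. Because lookups are dictionary accesses rather than an alignment against every document, this kind of index avoids the per-document cost of a DTW scan, which is the trade-off the abstract describes.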


Similar resources

Spoken Term Detection by N-gram Index with Exact Distance for NTCIR-SpokenDoc2

For spoken term detection, it is very important to consider Out-of-Vocabulary (OOV). Therefore, sub-word unit based recognition and retrieval methods have been proposed. This paper describes a very fast Japanese spoken term detection system that is robust for considering OOV words. We used individual syllables as sub-word unit in continuous speech recognition and an n-gram index of syllables in...


Spoken Term Detection Based on a Syllable N-gram Index at the NTCIR-11 SpokenQuery&Doc Task

For spoken term detection, it is crucial to consider out-of-vocabulary (OOV) and the mis-recognition of spoken words. Therefore, various sub-word unit based recognition and retrieval methods have been proposed. We also proposed a distant n-gram indexing/retrieval method for spoken queries, which is based on a syllable n-gram and incorporates a distance metric in a syllable lattice. The distance ...


Fast subword-based approach for open vocabulary spoken term detection

This paper describes an efficient two-stage approach using sub-phonetic segment N-gram index and shift continuous dynamic programming for open vocabulary spoken term detection. With this two-stage search, we attempt to improve performance in both retrieval accuracy and process time. In the speech recognition process, a more sophisticated subword that is shorter than phonemes is used to minimize...


Metric subspace indexing for fast spoken term detection

In this paper, we propose a novel indexing method for Spoken Term Detection (STD). The proposed method can be considered as using metric space indexing for the approximate string-matching problem, where the distance between a phoneme and a position in the target spoken document is defined. The proposed method does not require the use of thresholds to limit the output, instead being able to outpu...



Journal:
  • Speech Communication

Volume 55

Publication date: 2013